Before analyzing lagged correlations with lagci, you often have to deal with a lot of raw data. This tutorial walks you through processing raw data until it is in the correct format.
5.1 Get started with example data
Running this chunk produces data in the format that the lagci package needs. The data must have at least two columns: a time column of class POSIXct and a value column.
Use this script to convert your data to the correct format:
This code builds a two-column data frame: time is first created as a character string by concatenating the year, the month, and the fixed day "01", and value is copied from the original data.
It then shows that time is initially of class "character", converts it to a real timestamp with as.POSIXct() (add tz = "UTC" for reproducibility), prints the first few rows to verify the result, and confirms that time is now of class c("POSIXct", "POSIXt").
In short, it turns separate year and month information into a proper POSIXct time column plus a numeric value column, which is the tidy format expected by downstream lagged-correlation tools.
For stricter parsing, zero-pad the months with sprintf("%04d-%02d-01", year, month) before the conversion.
time value
1 2000-01-01 0.00000000
2 2000-02-01 0.02020179
3 2000-03-01 0.04039534
4 2000-04-01 0.06057240
5 2000-05-01 0.08072474
6 2000-06-01 0.10084413
class(example_data_2_correct$time)
[1] "POSIXct" "POSIXt"
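The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's original chunk: the input object raw_data, its year and month columns, and the six sample values are assumptions made up for the example, and the result is named example_data rather than example_data_2_correct.

```r
# Hypothetical raw input: separate year and month columns plus a value column.
raw_data <- data.frame(
  year  = rep(2000L, 6),
  month = 1:6,
  value = sin(seq(0, 1, length.out = 6))
)

# Build a zero-padded "YYYY-MM-01" character string for each row.
example_data <- data.frame(
  time  = sprintf("%04d-%02d-01", raw_data$year, raw_data$month),
  value = raw_data$value,
  stringsAsFactors = FALSE
)
class(example_data$time)  # still "character" at this point

# Convert the string to a real timestamp; a fixed tz keeps results reproducible.
example_data$time <- as.POSIXct(example_data$time, format = "%Y-%m-%d", tz = "UTC")

head(example_data)
class(example_data$time)  # now "POSIXct" "POSIXt"
```

Passing format and tz explicitly avoids depending on the locale or the system time zone when the strings are parsed.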
5.1.2 Type two: omics data format
When you obtain omics data, you may receive several files:
This chunk creates a toy omics-style wide table: it generates 5 human-readable feature IDs via ids::adjective_animal(), builds a half-hourly POSIXct time vector from 2019-04-29 03:30 to 2019-05-06 21:30, and simulates 5 time series using a noisy sine curve.
The replicate() output is transposed with t() so rows = features, columns = time; column names are set to the timestamps, and head(omics_data[1]) previews the first time column.
For robustness and reproducibility: call set.seed(1) before replicate(), explicitly coerce column names with as.character(time_index), and ensure nrow(omics_data) == nrow(IDs) so IDs match features.
If the ids package isn’t installed, fall back to data.frame(ids = paste0("id_", 1:5)). Later, convert this wide table to a tidy long format (id/time/value) before lagged-correlation analysis.
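A sketch of such a toy table is given here, under some assumptions: the sine period and the noise level (sd = 0.2) are arbitrary choices, and the column names are produced with format() and an explicit pattern rather than as.character(), so that every timestamp string has the same layout regardless of R version.

```r
set.seed(1)  # make the simulated noise reproducible

# Feature IDs: use ids::adjective_animal() when available, else plain labels.
IDs <- if (requireNamespace("ids", quietly = TRUE)) {
  data.frame(ids = ids::adjective_animal(5))
} else {
  data.frame(ids = paste0("id_", 1:5))
}

# Half-hourly timestamps covering the period described above.
time_index <- seq(
  from = as.POSIXct("2019-04-29 03:30", tz = "UTC"),
  to   = as.POSIXct("2019-05-06 21:30", tz = "UTC"),
  by   = "30 min"
)
n_time <- length(time_index)

# replicate() returns an n_time x 5 matrix; t() puts features in rows.
omics_data <- t(replicate(
  5,
  sin(seq(0, 4 * pi, length.out = n_time)) + rnorm(n_time, sd = 0.2)
))
omics_data <- as.data.frame(omics_data)
colnames(omics_data) <- format(time_index, "%Y-%m-%d %H:%M:%S")

# IDs and features must line up one-to-one.
stopifnot(nrow(omics_data) == nrow(IDs))
head(omics_data[1])
```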
You should transform them into five data frames, one per feature, each in the correct format:
This code converts the omics wide table into per-feature tidy frames. First, omics_data is transposed so rows = time, columns = features (full_data <- omics_data %>% t() %>% as.data.frame()), then column names are set from IDs$ids.
The loop builds df_list: for each feature, it creates a two-column data frame with time parsed from the row names (timestamps) and the corresponding value; head(df_list[[1]]) previews the first feature’s time–value series.
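The reshaping step can be sketched as below. To keep the example self-contained it uses a small stand-in table (3 features, 4 time points, made-up integer values) instead of the full simulated data, and it uses base R calls in place of the %>% pipe; both choices are assumptions for illustration only.

```r
# Small stand-in for the wide table: rows = features, columns = timestamps.
time_index <- seq(as.POSIXct("2019-04-29 03:30", tz = "UTC"),
                  by = "30 min", length.out = 4)
IDs <- data.frame(ids = paste0("id_", 1:3))
omics_data <- as.data.frame(matrix(1:12, nrow = 3))
colnames(omics_data) <- format(time_index, "%Y-%m-%d %H:%M:%S")

# Transpose so rows = time and columns = features, then name the columns.
full_data <- as.data.frame(t(omics_data))
colnames(full_data) <- IDs$ids

# One two-column (time, value) data frame per feature.
df_list <- list()
for (id in colnames(full_data)) {
  df_list[[id]] <- data.frame(
    time  = as.POSIXct(rownames(full_data),
                       format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
    value = full_data[[id]]
  )
}

head(df_list[[1]])
```

Each element of df_list then has the same time and value layout as the first example, so it can be handled one feature at a time.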